Abstract CD1, the third family of antigen-presenting molecules,
has previously been found only in mammals and chickens, which
suggests that chicken and mammalian CD1 share a common ancestral
gene that emerged at least 310 million years ago. Here, we describe
CD1 genes in the green anole lizard and Crocodylia, demonstrating
that CD1 is ubiquitous in mammals, birds, and reptiles. Although the
reptilian CD1 protein structures are predicted to be similar to
human CD1d and chicken CD1.1, phylogenetic analyses indicate that
CD1 isotypes are not orthologous between mammals, birds, and
reptiles, suggesting that the CD1 isotypes diversified independently
during the speciation of mammals, birds, and reptiles. In the green
anole lizard, although the single CD1 locus and the MHC I gene are
located on the same chromosome, they are separated by approximately
10 Mb of sequence, and, interestingly, several genes flanking the
CD1 locus belong to the MHC paralogous region on human chromosome
19. The CD1 genes in Crocodylia are located in two loci, linked
respectively to the MHC region and to an MHC paralogous region
(corresponding to the MHC paralogous region on chromosome 19).
These results provide new insights for studying the origin and
evolution of CD1.


Keywords CD1 - Reptile - Isotype - Evolution


Introduction 


CD1, which is a family of antigen-presenting molecules, can 
bind bacterial and autologous lipid, glycolipid, and 
lipopeptide antigens for presentation to T and NKT cells 
(Brigl and Brenner 2004; Jayawardena-Wolf and Bendelac 
2001; Matsuda and Kronenberg 2001; Moody et al. 2004; 
Porcelli and Modlin 1999). Although CD1 is related to both 
MHC class I and class II molecules (Koch et al. 2005; Martin 
et al. 1986; Porcelli 1995), CD1 is structurally more closely 
related to MHC class I molecules due to high sequence identity,
similar domain organization, and association with β2-microglobulin
(β2m) (Calabi and Milstein 1986; Martin
et al. 1986; McMichael et al. 1979). However, CD1 is functionally
more similar to MHC class II, as the tissue distribution
of CD1 is highly restricted (Brigl and Brenner 2004; Dougan 
et al. 2007). CD1 molecules are expressed on the surface of 
antigen-presenting cells, and most CD1 proteins appear to be 
localized to endosomal MHC II compartments, in which the 
MHC II molecules are thought to be loaded with exogenous 
antigens (Sugita et al. 1996). 

The mammalian CD1 family is composed of five nonpolymorphic
genes (CD1A, CD1B, CD1C, CD1D, and CD1E), which are categorized
into three groups based on their genomic organization, sequence
identity, and cellular functions: group 1 (CD1a, CD1b, and CD1c),
group 2 (CD1d), and group 3 (CD1e) (Adams and Luoma 2013; Brigl
and Brenner 2004). Sequence similarity is substantially higher
between the same isotype in different species than between
different isotypes within the same species (Porcelli 1995),
suggesting that each group of CD1 molecules has a different
function and presents different antigens. The CD1 isotypes are
differentially expressed with restricted tissue distribution and
can interact with both γδ and αβ T cells (Balk et al. 1991;
Castano et al. 1995; Jayawardena-Wolf and Bendelac 2001;
Moody et al. 2004; Porcelli et al. 1989; Porcelli and Modlin
1999). Despite the functional importance of CD1 in mammals,
non-mammalian CD1s have previously been found only in
chickens, which have two CD1 genes representing
two isotypes (Maruoka et al. 2005; Miller et al. 2005;
Salomonsen et al. 2005).

The genomic locations of CD1 genes are different between 
mammals and chickens, which may reflect the origin and evo- 
lution of CD1 to some degree. In mammals, the CD1 and 
MHC gene loci are located on different chromosomes 
(Albertson et al. 1988; Calabi and Milstein 1986; Dascher 
and Brenner 2003). For example, in humans, CD1 and 
MHC genes are located on two paralogous chromosomes, 
chromosome 1 and chromosome 6, respectively. However, 
the chicken CD1 genes are closely linked with MHC genes
in the same region (Maruoka et al. 2005; Miller et al. 2005; 
Salomonsen et al. 2005). 

It is now commonly accepted that CD1 genes originated
from MHC I at some stage of vertebrate evolution (Dascher
2007; Kasahara 1999; Martin et al. 1986; Maruoka et al.
2005; Miller et al. 2005; Salomonsen et al. 2005). However,
it remains controversial how and when this occurred.
Currently, three models have been proposed to explain the
associated evolution of CD1 with MHC genes. (1) Class I
genes duplicate to give class I and CD1 genes in the primordial
MHC gene locus, both of which are then distributed to
different MHC paralogous regions during 2R (the two rounds of
whole-genome duplication in vertebrate evolution) (Holland
et al. 1994; Ohno 1970), followed by differential silencing
of MHC and CD1 genes in different paralogous regions in
different lineages (Salomonsen et al. 2005). (2) Class I genes
in the primordial MHC are distributed to different MHC
paralogous regions during 2R, followed by evolution of
the class I gene into CD1 in one paralogous region, retention
of the class I gene in another paralogous region, and silencing
of the class I genes in the other two paralogous regions
(Kasahara 1999). (3) Class I genes in the primordial MHC
are distributed to different MHC paralogous regions during
2R, followed by retention of class I genes in two paralogous
regions and silencing of the class I genes in the other two
paralogous regions, and then evolution of a class I gene into CD1
in one paralogous region in a close ancestor of birds and
mammals (Miller et al. 2005).

In this paper, we describe several reptilian CD1 genes that
are homologous to mammalian and chicken CD1. The results
reveal that reptiles express distinct CD1 isotypes that are not
orthologous to mammalian or chicken CD1, suggesting that
the CD1 isotypes formed independently in distinct species
during speciation. The analysis of the genomic locations of
reptilian CD1 showed features that are both shared with and
different from those of mammalian and chicken CD1. These results
provide a new opportunity to trace the origin and evolution
of CD1.


Materials and methods 


Animals, DNA and RNA isolations, and reverse 
transcription 


Approximately 3-year-old green anole lizards (Anolis 
carolinensis) were purchased from a local pet market in Bei- 
jing. The Siamese crocodile (Crocodylus siamensis) was pur- 
chased from a crocodile breeding farm in Tianjin, and the 
Chinese alligator (Alligator sinensis) tissue samples were col- 
lected from the Anhui Research Centre for the Reproduction 
of the Chinese Alligator. The genomic DNA was isolated 
using a standard phenol-chloroform extraction method. The 
total RNA from the different tissues was prepared using a 
TRIzol kit (Tiangen Biotech, Beijing, China). The reverse 
transcription was conducted using M-MLV reverse transcrip- 
tase following the manufacturer’s instructions (Invitrogen, 
Beijing, China). 


Amplification of the conserved CD1 cDNA fragments 
using degenerate primers 


Two degenerate primers (CD1F: 5'-CCY RTK GCT GTG
GTC TTT GCC C-3'; CD1R: 5'-CTS CKG AKC TGG TAG
GTS AGG TCG-3') were designed according to the previously
reported chicken CD1 cDNA sequences (Miller et al. 2005)
and the green anole lizard CD1 sequence identified in this
study. RT-PCR using reptile spleen cDNA was carried out
under the following conditions: 95 °C for 5 min; 35 cycles
of 95 °C for 30 s, 50 °C for 30 s, and 72 °C for 30 s; and a final
extension at 72 °C for 7 min. The polymerase used was
LA-Taq DNA polymerase (Takara, Dalian, China). The resultant
PCR product was cloned into the pMD19-T vector
(Takara, Dalian, China) and sequenced.
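
As an illustration of what the degenerate positions in the two primers above imply, the following minimal Python sketch expands IUPAC ambiguity codes and counts how many distinct oligonucleotides each primer represents. The function and variable names are hypothetical and were not part of the original workflow; only the primer sequences and the standard IUPAC code table are taken as given.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Number of distinct sequences encoded by a degenerate primer."""
    count = 1
    for base in primer.replace(" ", "").upper():
        count *= len(IUPAC[base])
    return count


def expand(primer: str) -> list[str]:
    """List every non-degenerate sequence encoded by the primer."""
    bases = primer.replace(" ", "").upper()
    return ["".join(combo) for combo in product(*(IUPAC[b] for b in bases))]


# Primer sequences as given in the text (5' to 3').
FORWARD = "CCY RTK GCT GTG GTC TTT GCC C"
REVERSE = "CTS CKG AKC TGG TAG GTS AGG TCG"

if __name__ == "__main__":
    print(degeneracy(FORWARD), degeneracy(REVERSE))
    for seq in expand(FORWARD):
        print(seq)
```

A low degeneracy keeps the effective concentration of each individual oligonucleotide high, which is one common reason for restricting degenerate positions to a few sites.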


Amplification of the complete cDNA sequences 


We used the 3' RACE System for Rapid Amplification of
cDNA Ends (Invitrogen, Beijing, China) for the 3'-end
amplification. The RACE PCRs were performed according to the
manufacturer's instructions. Two primers were derived from
the conserved sequences for the nested PCR for the
Siamese crocodile (CrsiF1: 5'-TTT GCC GGG TCA CTG
GCT TC-3'; CrsiF2: 5'-TGC GGG ATG GTG AGG AGG
TG-3') and the green anole lizard (AncaF1: 5'-GGA GCC
TCC TGC AAC CAC TG-3'; AncaF2: 5'-CCT TGC CCA
GCT CAA GGA TC-3'). The PCRs were performed using
total spleen cDNA under the following conditions: 95 °C for
5 min; 35 cycles of 95 °C for 30 s, 60 °C for 30 s, and 72 °C
for 90 s; and a final extension at 72 °C for 7 min. The
polymerase used was LA-Taq DNA polymerase. The resultant
PCR products were cloned into pMD19-T and sequenced. We
used the 5' RACE System for Rapid Amplification of cDNA
Ends (Invitrogen, Beijing, China) for the 5'-end amplification
with three specific primers per gene that were designed using the
cDNA sequences obtained from the Siamese crocodile
(CrsiCD1.1gsp1: 5'-GCC ACG ATG AGA AGT GTG AC-3';
CrsiCD1.1gsp2: 5'-TGC TGT GTT CCA CAT GAC AG-3';
CrsiCD1.1gsp3: 5'-CAG CTG GTA GGT CAG ATC TG-3';
CrsiCD1.2gsp1: 5'-CTG CAG CCA ACA ATG AAA TG-3';
CrsiCD1.2gsp2: 5'-CGC AGC TGG TAG GTC AGG TC-3';
CrsiCD1.2gsp3: 5'-GCA GCC AGG TCA TGT GAA TG-3')
and the green anole lizard (AncaCD1gsp1: 5'-CTG CAG
ATA GAA TAA CAC TC-3'; AncaCD1gsp2: 5'-TCC CAC
AGG ATG ACA AGA CT-3'; AncaCD1gsp3: 5'-CCT GCA
AAC ATA ACT GTG TG-3'). Gsp1 was used to synthesize
the first-strand cDNA. The RACE PCRs were performed
according to the manufacturer's instructions. The resultant PCR
products were cloned into the pMD19-T vector and
sequenced. We designed specific primers based on the products
of the 3' RACE and 5' RACE to amplify the entire Siamese
crocodile CD1 cDNA sequences (CrsiCD1.1F: 5'-AGA
AGC CCC TCC AAA GCC TG-3'; CrsiCD1.1R: 5'-AAT
GGA AGA AGG AGA GAA TC-3'; CrsiCD1.2F: 5'-TGC
GAT GAT GCA GCA GCT TCC-3'; CrsiCD1.2R: 5'-CCT
CCG TAA CTG AGA GAG AAC-3') and the green anole
lizard CD1 cDNA sequences (AncaCD1F: 5'-TGG CCT
GCA GAT ATT TCC TG-3'; AncaCD1R: 5'-CTG CTT
TAG ATG AAC TTA AG-3'). Additionally, primers
(AlsiCD1.1F: 5'-CCA GAG CAT GCT GCC TCC TCT-3';
AlsiCD1.1R: 5'-ATC CAC TGC TTT ATA ACA CAC-3';
AlsiCD1.2F: 5'-TGC TCG CCT TCC CCA TGT CAT-3';
AlsiCD1.2R: 5'-CCT GGT CTT GCT TAG TTC AAG-3')
were designed for the amplification of two transcribed
Chinese alligator CD1 genes.
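
For readers who want a quick sanity check of gene-specific primers such as those listed above, the sketch below computes GC content and a rough melting temperature using the Wallace rule, Tm = 2(A+T) + 4(G+C). This is only an assumption about how one might screen primers; the text does not state how the primers were evaluated, the helper names are hypothetical, and the Wallace rule is a coarse estimate intended for short oligonucleotides.

```python
def clean(primer: str) -> str:
    """Remove spacing and normalise case of a primer written in triplets."""
    return primer.replace(" ", "").upper()


def gc_content(primer: str) -> float:
    """Fraction of G and C bases in the primer."""
    seq = clean(primer)
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(primer: str) -> int:
    """Rough melting temperature in degrees C: 2*(A+T) + 4*(G+C)."""
    seq = clean(primer)
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# CrsiF1 as given in the text (5' to 3').
CRSI_F1 = "TTT GCC GGG TCA CTG GCT TC"

if __name__ == "__main__":
    print(len(clean(CRSI_F1)), gc_content(CRSI_F1), wallace_tm(CRSI_F1))
```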


Southern blotting 


The α1 domain-encoding sequences of the Siamese crocodile
and green anole lizard CD1 were used as probes for Southern
blotting. These cDNA fragments were labeled using a PCR
DIG Probe Synthesis Kit (Roche, Beijing, China) with the
following primers: CrsiCD1.1pF: 5'-TGC GTC TGC TGC
AGA CCA TC-3'; CrsiCD1.1pR: 5'-CCC ATA GAG CAC
TGA GTC AC-3'; CrsiCD1.2pF: 5'-TCC CCC TCC CTT
GCC TCT TT-3'; CrsiCD1.2pR: 5'-CCA CGG TGA CAA
CTG TAT TG-3'; AncaCD1pF: 5'-GTC CAT CCA GCC
TTC TTT CA-3'; 5'-CTG CAA CCA TGT TGT TGA TG-3'.
The hybridization and detection were performed using the
DIG High Prime DNA Labeling and Detection Starter Kit II
(Roche, Beijing, China), following the manufacturer's
instructions.


Detection of the gene expression in different Siamese 
crocodile tissues via quantitative real-time PCR 


The cDNA samples from seven tissues (heart, liver, spleen,
lung, kidney, small intestine, and stomach) were used to detect
CD1 expression in the Siamese crocodile via quantitative
real-time PCR. The PCRs were performed using a
LightCycler 480 and the LightCycler 480 SYBR Green I Master
Mix (Roche, Beijing, China). Each sample was run in
triplicate. The Siamese crocodile EF1α1 gene was chosen as
the internal control. The PCRs were performed under the
following conditions: 95 °C for 10 min; 35 cycles of 95 °C for
10 s, 60 °C for 20 s, and 72 °C for 15 s; and a final extension at
72 °C for 7 min. The PCR primers were as follows:
CrsiCD1.1F2: 5'-CTC AGG CAA GTG GGT AGC TC-3';
CrsiCD1.1R2: 5'-TGT GTC AAT TGT GCC CTT GT-3';
CrsiCD1.2F2: 5'-TGC AGT TCC TGC TCC AGA AC-3';
CrsiCD1.2R2: 5'-TCC TGC CTC TTC AGT GTC TC-3';
and EF1α1F: 5'-TGA TGC TCC TGG ACA CAG AG-3';
EF1α1R: 5'-GCC CAT TCT TGG AGA TAC CA-3'.
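
The text does not state how the triplicate qPCR readings were converted into normalized expression folds. One widely used option is the comparative Ct (2^-ΔΔCt) method, sketched below under the assumption of roughly 100 % amplification efficiency for both target and reference. The function name and all Ct values are hypothetical illustrations, not data from this study.

```python
from statistics import mean


def relative_expression(
    target_ct: list[float],
    reference_ct: list[float],
    calibrator_target_ct: list[float],
    calibrator_reference_ct: list[float],
) -> float:
    """Fold change of a target gene by the comparative Ct method.

    Each argument holds the replicate Ct values for one gene in one
    tissue. The calibrator is the tissue that all others are compared to.
    """
    # Delta Ct: target normalised to the internal control in each tissue.
    d_ct_sample = mean(target_ct) - mean(reference_ct)
    d_ct_calibrator = mean(calibrator_target_ct) - mean(calibrator_reference_ct)
    # Delta-delta Ct, then fold change assuming a doubling per cycle.
    dd_ct = d_ct_sample - d_ct_calibrator
    return 2.0 ** (-dd_ct)


if __name__ == "__main__":
    # Hypothetical triplicates: a high-expressing tissue versus a calibrator.
    fold = relative_expression(
        target_ct=[20.0, 20.0, 20.0],
        reference_ct=[18.0, 18.0, 18.0],
        calibrator_target_ct=[24.0, 24.0, 24.0],
        calibrator_reference_ct=[18.0, 18.0, 18.0],
    )
    print(fold)
```

If primer efficiencies differ appreciably from 100 %, an efficiency-corrected model would be more appropriate than this simple form.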


Sequence alignments, comparisons, three-dimensional 
structural modeling, and construction of the phylogenetic 
tree 


MegAlign (DNAStar/Lasergene) (Hein 1990) was used for
the sequence comparisons and identity calculations. The
three-dimensional structures of the reptile CD1 were predicted
via SWISS-MODEL (http://swissmodel.expasy.org/). The
PDB files used in the analysis are 1ZT4 (human CD1d), 3JVG
(chicken CD1.1), and 3DBX (chicken CD1.2). PyMOL was
used to display the cartoon representation. The phylogenetic
tree was made using MrBayes 3.1.2 (Ronquist and
Huelsenbeck 2003) and viewed in TreeView (Page 1996).
Multiple sequence alignments were performed using Clustal
X 1.83 (Thompson et al. 1997). The accession numbers of the
sequences used for the comparisons and constructions were as
follows: human (Homo sapiens) huCD1a: NP_001754.2;
huCD1b: NP_001755.1; huCD1c: NP_001756.2; huCD1d:
NP_001757.1; huCD1e: CAA33100.1; HLA-A: NP_002107.3;
HLA-B: NP_005505.2; HLA-C: NP_002108.4;
HLA-DRB1: NP_002115.2; HLA-DRB3: NP_072049.2;
northern brown bandicoot (Isoodon macrourus) IsmaCD1:
ABI99485.1; chicken (Gallus gallus) chCD1.1: AAX49403.1;
chCD1.2: AAX49406.1; BF1: NP_001038148.1; BF2:
NP_001026509.1; BLB1: NP_001038159.1; BLB2:
NP_001038144.2; Xenopus (Xenopus laevis) Xela-Ia:
AAA16064.1; Xela-IIb: NP_001108243.1; zebrafish (Danio
rerio) DareDEB: NP_571552.1; Dare-UBA: NP_571546.1;
nurse shark (Ginglymostoma cirratum) Gici-UAAOLI:
AAF66110.1; Gici-IIb1: AAF82681.1; green anole lizard
(Anolis carolinensis) AncaCD1: KJ191193; Chinese alligator
(Alligator sinensis) AlsiCD1.1: KJ191190; AlsiCD1.2:
KJ191189; Siamese crocodile (Crocodylus siamensis)
CrsiCD1.1: KJ191192; CrsiCD1.2: KJ191191; spectacled
caiman (Caiman crocodilus) CacrIa: AHC72441.1; CacrIIb:
AAF99284.1; inshore hagfish (Eptatretus burgeri) IgSF3:
BAE93396.1.
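
To make the notion of pairwise sequence identity concrete, the following minimal sketch computes percent identity from two rows of an existing alignment. It is not the MegAlign algorithm: identity is defined here as matches divided by the number of columns in which neither sequence has a gap, which is one of several conventions, so values may differ from those produced by other tools. The function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns where neither row has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper()):
        if res_a == gap or res_b == gap:
            continue  # skip columns containing a gap
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy aligned fragments, not real CD1 sequences.
    print(percent_identity("ACDE-F", "ACDQGF"))
```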


Results 
Identification of the CD1 genes in certain reptiles


Reptiles and birds both belong to the Reptilia and shared a
common ancestor approximately 220 million years ago (Mya) (Kumar
and Hedges 1998). As the CD1 gene has been identified in
chickens (Miller et al. 2005; Salomonsen et al. 2005), it is
highly likely that CD1 genes are also present in reptiles. Using
the chicken CD1 amino acid sequences, we searched the green
anole lizard genomic database in Ensembl (assembly
AnoCar2.0) and discovered a homologous sequence at position
189410454-189410738 on chromosome 2. Upon sequence
alignment, we found that the sequence was more similar
to CD1 than to the MHC I gene. 5' RACE and 3' RACE
were thus performed to obtain a complete cDNA sequence for
the green anole lizard CD1. Based on the green anole lizard
and chicken CD1 sequences, a pair of degenerate primers was
designed to screen for CD1 genes in other reptiles. The PCRs
were performed using spleen-derived cDNA from
the red-eared turtle (Trachemys scripta elegans), Siamese
crocodile (Crocodylus siamensis), beauty snake (Orthriophis
taeniurus), and Burmese python (Python bivittatus). Because
of the non-specificity of the primers, a homologous sequence
was amplified only in the Siamese crocodile, and two similar
full-length cDNA sequences were obtained via 5' RACE and
3' RACE.

Using the human, chicken, green anole lizard, and
Siamese crocodile CD1 sequences, we searched the Chinese
alligator (Alligator sinensis) genomic database and
the other available reptilian genomic databases. Several
predicted CD1 sequences and CD1-like sequences were
found in the genomic data on NCBI, including those
in the Chinese alligator (Alligator sinensis), American
alligator (Alligator mississippiensis), Burmese python
(Python bivittatus), green sea turtle (Chelonia mydas),
Chinese soft-shell turtle (Pelodiscus sinensis), and
western painted turtle (Chrysemys picta) (Supplemental
Table 1). The majority of the CD1 sequences were not
used for further analyses because some predicted sequences
were overlong or incomplete. Three Chinese
alligator CD1 sequences were identified in the genomic
database. RT-PCR using spleen-derived cDNA was
subsequently employed to determine whether they are
transcribed. The results showed that two of
the three sequences were transcribed, and the third
was not detected via RT-PCR. Of the two transcribed
sequences, one was functional, while the other
appeared to be a pseudogene because 13 nucleotides are
missing in the α2-encoding exon, leading to a premature
stop codon.
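
The link between a 13-nucleotide deletion and a premature stop codon follows from the reading frame: 13 is not a multiple of 3, so every codon downstream of the deletion is shifted. The sketch below illustrates this with the standard genetic code and a made-up open reading frame. The sequence and the deletion position are hypothetical and are not taken from AlsiCD1.1.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate from the first base until the first stop codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def delete(cds: str, start: int, length: int) -> str:
    """Remove `length` nucleotides beginning at 0-based index `start`."""
    return cds[:start] + cds[start + length:]


# Hypothetical toy ORF (12 codons plus a stop codon).
WILD_TYPE = "ATGGCCCTGAAGGTGACCTGGCTGATTGAACGTTCCTAA"
MUTANT = delete(WILD_TYPE, start=6, length=13)

if __name__ == "__main__":
    print(13 % 3)                # non-zero remainder: the frame is shifted
    print(translate(WILD_TYPE))  # full-length toy peptide
    print(translate(MUTANT))     # truncated by a stop codon in the new frame
```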


Phylogenetic analysis of the CD1, MHC I, and MHC II
genes


In comparison with the full-length chicken and human
CD1 and MHC I amino acid sequences, the full-length
reptilian CD1 sequences of the green anole lizard, Siamese
crocodile, and Chinese alligator show amino acid identities
of ~25.9-38.3 % with the CD1 sequences and ~20.1-25.3 %
with the MHC I sequences, while for the conserved α3
domain the identities are ~36.6-65.3 %
and ~24.2-40.1 %, respectively (data not shown).

To deduce the phylogenetic relationships of the reptilian
CD1 genes with CD1, MHC I, and MHC II genes
in other species, we used human, chicken and reptile
CD1, and the full-length fish, Xenopus laevis, reptile,
chicken, and human MHC Ia and MHC Ib amino acid
sequences to perform phylogenetic analyses. The
phylogenetic analyses were performed independently using
three methods (Bayesian, neighbor-joining,
and maximum likelihood), and these analyses generated
trees with the same topology. The results strongly
supported that the identified sequences in reptiles were
CD1 genes, as they formed a unique clade with CD1
but not with MHC genes from other species (Fig. 1,
Supplemental Fig. 1).

The phylogenetic analysis also revealed that the crocodilian
CD1 should be divided into two distinct isotypes, but the
crocodilian isotypes are distinct from the chicken and human
CD1 isotypes. It also revealed that the green
anole lizard CD1 was not orthologous to any of the
Crocodylia, chicken, or human CD1 isotypes. We therefore
designated the detected reptilian CD1 genes as AncaCD1
(GenBank No: KJ191193) for the green anole lizard,
CrsiCD1.1 (GenBank No: KJ191192) and CrsiCD1.2
(GenBank No: KJ191191) for the Siamese crocodile, and
AlsiCD1.1 (GenBank No: KJ191190) and AlsiCD1.2
(GenBank No: KJ191189) for the Chinese alligator, a
designation based on the MHC and chicken CD1 nomenclature (Klein
et al. 1990; Miller et al. 2005; Salomonsen et al. 2005).

Fig. 1 Phylogenetic tree of the full-length amino acid sequences of fish,
amphibian, reptile, bird, and mammalian CD1, MHC I, and MHC II. The
phylogenetic tree was constructed using MrBayes 3.1.2 and is viewed in
TreeView. The credibility value for each node is shown. The inshore
hagfish (Eptatretus burgeri) immunoglobulin superfamily 3 gene
(IgSF3) (AB242223), which has an Ig-like domain that is similar to
MHC and CD1, was used as the outgroup in the phylogenetic analysis


Southern blotting and expression analyses
of the reptilian CD1s


A Southern blot was performed using the CrsiCD1.1
and CrsiCD1.2 α1 domain-encoding sequences as
probes (Fig. 2a, b). The results showed several bands
for CrsiCD1.2, AlsiCD1.1, and AlsiCD1.2, suggesting
that there are several copies of these genes in the
genome. In contrast, no more than two bands were
observed for CrsiCD1.1 after digestion with each of the
different restriction enzymes, suggesting that this gene is
likely a single-copy gene. Another Southern blotting
analysis was performed using the AncaCD1 α1
sequence as a probe (Fig. 2c). The results showed that
the green anole lizard likely has only one copy of the
CD1 gene.

Fig. 2 Southern blotting of the Siamese crocodile, Chinese alligator, and
green anole lizard CD1 genes. The probes were designed based on the
CrsiCD1.1 and CrsiCD1.2 α1 sequences and were used for both the
Siamese crocodile and Chinese alligator CD1 due to the high similarity
between the α1 sequences of the two crocodilians. The probe for the green
anole lizard CD1 was designed using its α1 sequence. Four restriction
enzymes were used for each Southern blot and are indicated at the top.
a The Southern blotting result of the Siamese crocodile CD1 genes. b The
Southern blotting result of the Chinese alligator CD1 genes. c The Southern
blotting result of the green anole lizard CD1 genes
[Blot panels labeled CrsiCD1.1, CrsiCD1.2, AlsiCD1.2, and AncaCD1; size markers 1 to 10 kb]

To analyze the crocodilian CD1 expression pattern,
we designed two pairs of qRT-PCR primers according to
the full-length CrsiCD1.1 and CrsiCD1.2 cDNA
sequences. The qRT-PCRs were performed using cDNA
from seven tissues from the Siamese crocodile
(Fig. 3). The results showed that the highest level of
expression of both CrsiCD1.1 and CrsiCD1.2 is in the
spleen. Both CrsiCD1.1 and CrsiCD1.2 displayed low
expression levels in the other tissues.

Fig. 3 The tissue expression of CrsiCD1.1 and CrsiCD1.2. The
expression levels of CrsiCD1.1 and CrsiCD1.2 were examined via
qRT-PCR. The Siamese crocodile eEF1A1 gene was used as an internal
control. The seven tissues are listed under the x-axis. The y-axis indicates
normalized expression folds. a The expression levels of CrsiCD1.1 in
different tissues. b The expression levels of CrsiCD1.2 in different tissues

We observed multiple bands in the RT-PCR of AlsiCD1
from total RNA, and the bands were cloned and sequenced.
The results showed that both the AlsiCD1.1 and AlsiCD1.2 genes
expressed multiple transcripts; AlsiCD1.1 has six different
transcripts (X1 to X6), and AlsiCD1.2 has four (X1 to X4)
expressed in the spleen, lung, and small intestine. These
distinct transcripts should have arisen from RNA splicing, since the
strict RNA splicing rule (i.e., GT-AG splice sites, or in rare cases
GC-AG) was observed when they were aligned with their
respective genomic sequences (Supplemental Fig. 2). The
schematic splicing patterns of all variants for both AlsiCD1.1
and AlsiCD1.2 are shown in Fig. 4. Perhaps because of the
missing 13 nucleotides in the α2-encoding exon, all the
spliced variants of AlsiCD1.1 involve only the α2 exon. In
contrast, more exons in AlsiCD1.2 are included in splicing
events (Fig. 4a, b).

Similarly, in the spleen, CrsiCD1.2 expresses two
alternatively spliced forms, one containing a partial α2 domain and
the other lacking the transmembrane region (Fig. 4c,
Supplemental Fig. 2). CrsiCD1.1 does not have any alternative
splicing variants.
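
The GT-AG rule mentioned above can be checked mechanically once a transcript has been aligned to its genomic sequence and the gap coordinates are known. The sketch below tests whether a candidate intron begins with GT (or, rarely, GC) and ends with AG. The coordinate convention, function name, and toy sequence are assumptions for illustration only and do not reproduce the alignment procedure used for Supplemental Fig. 2.

```python
# Donor/acceptor dinucleotide pairs accepted by the rule described in the text.
CANONICAL_BOUNDARIES = {("GT", "AG"), ("GC", "AG")}


def follows_splice_rule(genomic: str, intron_start: int, intron_end: int) -> bool:
    """Check an intron given as genomic[intron_start:intron_end].

    Coordinates are 0-based and end-exclusive (an assumed convention).
    """
    intron = genomic[intron_start:intron_end].upper()
    if len(intron) < 4:
        return False  # too short to carry both boundary dinucleotides
    return (intron[:2], intron[-2:]) in CANONICAL_BOUNDARIES


def splice_out(genomic: str, intron_start: int, intron_end: int) -> str:
    """Return the sequence with the intron removed."""
    return genomic[:intron_start] + genomic[intron_end:]


# Toy locus: exon + intron (GT...AG) + exon. Not a real CD1 sequence.
TOY_GENOMIC = "ATGGCC" + "GTAAGTCCCTTTCAG" + "GGCTGA"

if __name__ == "__main__":
    print(follows_splice_rule(TOY_GENOMIC, 6, 21))
    print(splice_out(TOY_GENOMIC, 6, 21))
```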


Comparison of the reptilian CD1 genes with the human
and chicken CD1 genes


We aligned the full-length reptilian CD1 amino acid
sequences with those of two chicken CD1 and five human
CD1 for comparison (Fig. 5). The sequence identities are
~50.6-85.7 % among all of the crocodilian CD1 sequences,
~25.9-30.6 % between the reptile and human CD1, and
~33.9-38.3 % between the crocodilian and chicken CD1.
The conserved α3 domain shows greater similarity than the
full-length CD1 between reptiles, chickens, and humans (data
not shown). The conserved cysteines that exist in chCD1.1
(C98-C163, C202-C260) and the human CD1s, with the
exception of chCD1.2 (Miller et al. 2005; Salomonsen et al.
2005), also exist in most reptilian CD1s, but one of them is
missing in AlsiCD1.1, apparently due to a sequence deletion
(Fig. 5).

The N-linked N-X-(S/T) glycosylation sites in human CD1 
and the reptilian CD1 were predicted using NetNGlyc (http:// 
www.cbs.dtu.dk/services/NetNGlyc/). We found nine clusters 
of glycosylation sites in 12 CD1 sequences, which are marked 
N1 to N9 (Fig. 5). N1 is highly conserved in the human, 
chicken, and green anole lizard, but not in crocodiles, 
whereas N9 is conserved in three reptiles but not in human 
and chicken. The cytoplasmic tails of human CD 1b, -c and -d 
contain a positively charged membrane anchor followed by 
the sequence SYQ (huCDlb, SYQNIP; huCDlc, SYQDIL; 


AsiCD1.1-X1 M SLI 1149bp 
AlsiCD1 1X2 —— SL 81%bp 
AlsiCD1.1-X3 a S  1066bp 
A\siCD1 1X4 “OTST APPMVTVA[VTV”T+q+.01.-"— | 77 1 D 
AlsiCD 1. d- X S A | 022b 
AsiCD1.1-X¢ M D as 
A\siCD1.1 mmm : : p a ISLI 1034bp 
H= + + + + 
L ai a2 a3 TM CYT 
AlsiCD1 2X1 MMS | 669bp 
AlsiCD1 2X2 OO | 8975) 
AlsiCD1 2-X3 | 486bp 
AlsiCD 1 2-X4 =| 58 1b 
LI 1062bp 


Fig. 4 A schematic diagram showing the alternative splicing variants in 
Chinese alligator and Siamese crocodile CD1. AlsiCD1.1 and AlsiCD1.2 
cDNA fragments from the spleen, lung, and small intestine, CrsiCD1/.2 
cDNA fragments from the spleen were cloned following RT-PCR. Ten to 
30 clones from each tissue were sequenced and aligned. The identical 
residues are in indicated in black, and the missing or inserted nucleotides 
from the sequenced clones are indicated in white or light gray, 
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respectively. The alternative splicing forms derived from the different 
tissues are indicated by Zetters to the right of the schematic diagram; S 
spleen, L lungs, and / small intestine. a Seven PCR products were 
observed: the full-length AlsiCD1.1 is on the bottom. b Five PCR 
products were observed: the full-length AlsiCD1.2 is on the bottom. c 
Two PCR products were observed: the full-length CrsiCD1y2 is on the 
bottom 
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a1 helix 
FFHAFFFHFFAFAFEF FF FEFFF P44 444 


AlsiCD1.1 (p): AASPPVPTGSGSLHLOQOTI IFQEPGK-AEVWG-LALVWDVETHTLDCATCPIRFLQPWAQSAIS PEHWHDLELLIHLYLANFIHOVNVWVQQEGLR 
GrsreDiied. : AASPPVPTGPGSLRLLOTIVFQDAGK-AEMLG-LALVEDVETCTLDCATCPIRFLQPWAQSAIS PEHWRDLOLLIHLYLANFIHQVNLWAQQEGFS 
AlsiCD1.2 MPPLPAPME POT LRLLATMVFHDTTRAADSQG-TALLGDVPTHTMDCGTCPIRFHQPWAHQGLS PKQWHDLEQAT HLYLGLITDTVVSVVQOGTGVS 
CrsiCb1. 2 : MPSLPPPGEPOQILOLLTTKVFHDTTRAADTQG-TALLGDVPTHAMDCGTCPIRFHQPWAROQGLS PKQWGDLEKAIHLYLALITNTVVTVVOQOGTGVS 
AncaCD1 : AAFFHLPAVLWPFRMLOTISFQNTSA-TEIMGTIAFLGDVETHSLDTHTWKIKFLQPWTQSAFTPLKWEMLGOQLFRASFI DEKKAINNMVAASNY S| 
chep. i1 : PGTTAEPEGSHMLKLLHFATFQNSTIS-VLVGG-LGLLGDVKMGS LDSRTGNIRYYRPWLRPSLPKGDWDVIESSIKSYVRDFSRLVOMYTTVP--- 
GhCD1.2 : ETSCPPPEESQFFQLFYTLLLGNVS|S-TELTG-MALLADVP IMVLDPHTWNLNI CR PWVQEITAETEVKKILS FSMVGIRNTIRFMHEMTAKAGLD 
huCDla : DGNADGLKEPLSFHVTW —-KONLV-SGWLSDLOTHTW DSNSgT VFLCPWSRGNFSNEEWKELETLFRIRTIRSFEGIRRYAHELOFE 
huCD1b : GNSEHAFQGPTSFHVIQT -AQTQG-SGWLDDLQIHGWDSDSGTAIFLKPWSK pesprevactee FRVYIFGFAREVQDFAGDFQMK 
huCDic : GDNADASQEHVSFHVIQ -ARGQG-SGWLDELQTHGWDSESGTIIFLHNWSKGNESNEELS DLELLFRFYLFGLTREIQDHASQDYSK 
huCD1d : WGSAEVPQRLFPLRCLO -TRTDG-LAWLGELOTHSWSNDSDTVRSLKPWSQGTFSDQQWETLQHIFRVYRSSFTRDVKEFAKMLRLS 
huCDle : GENTAAAEEQLSFRMLQ1 ~AHSEG-SGWLGDLOTHGWDTVLGTIRFLK PWSHGNFSKOELKNLOQSLFOLYFHSFIQIVOQASAGOFOQLE 
a2 helix ne 
FEPEETETEE EEE E TEETH EEE Pt 
AlsicD1.1 (p) DPEVTQCSVGCELL meia i DLVSFSSG--KWVAQRQDKLALHVRDSLNRDKGTADTLENLLNH*****LQILLRNGKEVLERQ 
GrsicDi.1 YPFVTQCSMGCELLPSGASWGAYINAGLGGE DLVSFSSG--KWVAQRODKLALHVODSLNRDKGT I DTMENLIINHTC I QDLOTLLRNGKEVLERQ 
AlsiCD1.2 FPFVIQILMGCEILHNGTSYSFYLSTRDRDDLVRFNLATGEWVAAPGDKMAQRVCRSFSQDOGTSSRLRFLLOYTCVTETQSFAY YGKETLKRQ 
CesiCcDl.:2 FPFVIQILMGCEVLHNGTSHS FYQSVRDRHDLVRFNLATGEWVAAPGDEMAQRVRRSFSQDRGTSSRLRFLLONTCVAE ILS FAY YGKETLKRQ 
AncaCD1 YPFVIQSFFFCEIGTDGTKRGFYKGAANGDDVLGYSTDNATWVVQKDT PLAVAVODFLNRNKGTTANMRSLLLNECIDILESS LKTQNETILHRQ 
chcD1.1 YPFVFQSSIGCE aao anion LSAKAEH LMANASTLNEVIQVLUINDTEVDI LRLFIQAGKADLERR 
chcD1.2 YPRVFQIHTGCKLYT IRWSFVNIGEGGRDLVTYELSRERWVPORSTLLAKVMSNTLTDLRAVSGFLEHVFSSSFPNYILMLHEEGRTDLERR 
hucCDla YPFEIQVTGGCELHSGKVSGSFLQLAYQGSDFVSFQ GWLPYPVAGNMAKHFCKVL-NQNQHENDITHNLLS DTCPRFILGLLDAGKAHLORQ 
huCD1b YPFEIQGIAGCELHSGGAIVSFLRGALGGLDFLSV finfovesracconaaneen LI-IQYQGIMETVRILLYETCPRYLLGVLNAGKADLORQ 
huCDic YPFEVQVKAGCELHSGKSPEGFFQVAFNGLDLLSFQ} WVPSPGCGS LAQSVCHLLNHQYEGVTETVYNLIRSTCPRFLLGLLDAGKMYVHRQ 
huCD1d : YPLELQVSAGCEVH PGNASINN FFHVAFQGK DILSFQGTSWEPTQEAPLWVNLAIOQV: L-NQDKWTRETVQWLIINGTCPO FVSGLLESGKSELKKQ 
huCDle : YPFE ORLECERMN-- APOL FLNMAYQGSDFLSFQGISWEPSPGAGIRAQNICKVL-NRYLDIKE THOSHEGHT EPR FLAGLMEAGESELKRK 
N5 N6 N7 N8 
AlsiCD1.1(p): ERPVAVVFAQQPPVASELPLLLVCRVTGFYPRLIHVAWLRDGEELPPGLGINSTELLPNTDLTYQLRVVLAVD-PGAGHRYACHVEHSSLGGHSLVIPW 
Croicpi-i : ERPVAVVFARQPPITSELPLLLVCRVTGFYPRPIRVTWLRDGEEVPPGPG LLPNADLTYQLRSVLAVD-LGAGHRYACHVEHSSLGGHSLVIPW 
AlsicpD1.2 ERPVAVVFARQSPITVELPLLLVCWVTGFYPRPIHVTWLRDGEEVTPGPGINSSGLLPNADLTYQLRIVLAID-PGAGHSYACRVEHSSLGRQGLVVHW 
Craicpi,2 : ERPVAMVFARQPPIASKLPLLLVCRVTGFYPRPIHMTWLRDGEEVPPGPGUNSSGLLPNADLTYQLRIVLAID-LGAGHS YACRVEHSSLGSRGLVVHW 
AncaCD1 : EKPVAVVFAQEP-PATTDSLLLVCQVTGFYPHLINVSWLQD-EVALPSSR ILPNYDLTYQIRSSLAIKSMETSHSYVCRIQHSSLDGKSLVILW 
chep. i : VPPMAVVFAR---TAGQAQLLLVCRVTSFYPRPIAVTWLRDGREVPPSPALSTGTVLPNADLTYQLRSTLLVS-PQDGHGYACRVQHCSLGDRSLLVPW 
cChCD1..2 : VPPMAVVFAR---TAGQVQOLLLVCRVTSFYPRPIAVTWLRDGREVPPSPALSTGTVLPNADLTYQLRSTLLVS —PQDGHSYACRVQHCSLGDRSLLVPW 
huCDla : VKPEAWLSHG--PSPGPGHLOLVCHVSGFYPKPVWVMWMRGEQEQQ---GTQRGDI LPSADGTWYLRATLEVA-AGEAADLSCRVKHSSLEGQDIVLYW 
huCD1b : VKPEAWLSSG--PSPGPGRLOLVCHVSGFYPKPVWVMWMRGEQEQQ---GTQLGDI LPNANWTWYLRATLDVA-DGEAAGLSCRVKHSSLEGQDIILYW 
huCDic : VRPEAWLSSR--PSLGSGQLLLVCHASGFYPKPVWVTWMRNEQEQL---GTKHGDILPNADGTWYLQVILEVA-SEEPAGLSCRVRHSSLGGQDIILYW 
hucCDl1d : VKPKAWLSRG--PSPGPGRLLLVCHVSGFYPKPVWVKWMRGEQEQQ---GTQPGDILPNADETWYLRATLDVV-AGEAAGLSCRVKHSSLEGQDIVLYW 
huCDle : VKPEAWLSCG--PSPGPGRLOLVCHVSG cer ae eee ee DETWYLRAT BINA BGEARG US ERVEHSS LGGHDLI IHW 
™ 
AlsiCD1.1(p): ESRSPWKTK-VTVGILVTLLIVVMLVVAMA-YLOWRRRRYQDIS------~---~------------------------ 
CRSTCDL A. : ESRSHWKTN-VAVGILVTLLIVAMLVAALV-YLOWRCRTYQDIN---—---—-----~---------------------— 
ALSiCD1 2 GPGGHWEVG-LAVGIVISLLAAAAVAAVLW-WMRHRSMRPLL---------------------------------—- 
CrsiCD1..2 : GPGGNWGVG-LAVGIAISLLAAAGLAAVLW-WRRHRYTRPEQRDSMGL----------------------------- 
AncaCD1 : ERKHRYRVT-IVVVVLVASILVVVAGVLFYLQKKRRQOYEDVNQAISKTARQ-------------------------- 
GheEDi «i : —-EDSKWGLS-AGLGALLLLAAAAVAAVLVRRYRKRORVDEVRS I PLAEHRGTARDGTAAGOYGGCDRET PDEGRGHI 
chcD1.2 : --ENPSASSTVGITITILLLAAI ITG—-GIWWWRRRKHAGSGTDFRTFLI-----——----------------------- 
hucCDla : --EHHSSVGFIILAVIVPLLLLIGL--ALW-FRKRCFC------------ = 
huCD1b : ==-RNPTSIGSIVLATIVPSLLLLLC=LALW—YMRRRS YONI Pas 3 = 
huCDic : —--GHHFSMNWIALVVIVPLVILIVL--VLW-FRKKHCS YQDIL----------------------------------- 
huCD1d : -GGSYTSMGLIALAVLAC LLFLLIVGFTSR-FKRQTSYQGVL oe 
huCDle s —-GGY----SIFLILICLTVIVTLVILVVVD-SRLKKQR----——---------------- 


Fig. 5 Alignment of the α1, α2, and α3 domains of the amino acid
sequences of all analyzed reptilian CD1s, chicken CD1s, and human
CD1s. The α1 and α2 helices are marked with a “+”. The gaps in the
alignments and partial sequences are filled with dashed lines. The
conserved intramolecular disulfide bonds are indicated using black
triangles. The letter “N” at the bottom of the sequences indicates the
positions of the N-linked (NXS/T) glycosylation sites; those that have
been proven using the crystal structure or were predicted by NetNGlyc
(http://www.cbs.dtu.dk/services/NetNGlyc/) are marked with rectangles.
The potential glycosylation site in human CD1d is marked with a
gray-shaded rectangle. N1 to N9 show the nine different N-linked
glycosylation sites. The groove-forming residues in chCD1.2 (Zajonc
et al. 2008) are marked in gray. The transmembrane region is marked
above the line


huCD1d, SYQGVL). A similar sequence was also found in the
cytoplasmic tails of the two murine CD1 proteins (SAYQDIR)
(Blumberg et al. 1995). The motif YQXI/V (where X can be
any amino acid) in the cytoplasmic tails may also be a signal
for internalization and targeting to an endosomal compartment
(Sandoval and Bakke 1994). In contrast, the motif YGGC
was found in the cytoplasmic tails of chCD1.2 (Miller et al.
2005; Salomonsen et al. 2005). In our study, an identical
YQDI motif was found in the cytoplasmic tails of both
crocodilian CD1.1 proteins (Fig. 5). A variant motif, YEDV, was
found in the cytoplasmic tail of the green anole lizard CD1.
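
As an illustration of how such tail motifs can be located, the following minimal Python sketch scans the tail fragments quoted above for the YQXI/V pattern. The function name find_motif, the dictionary labels, and the relaxed pattern YXXI/V (added only so that the YEDV variant is reported) are illustrative assumptions and not part of the original analysis.

```python
import re

# Strict motif from the text: Y-Q-X-I/V, where X is any amino acid.
STRICT = re.compile(r"YQ[A-Z][IV]")
# Relaxed form (an assumption for illustration): Y-X-X-I/V, which also
# reports variants such as YEDV.
RELAXED = re.compile(r"Y[A-Z]{2}[IV]")


def find_motif(tail, pattern):
    """Return (start, matched_text) for each non-overlapping match in a tail."""
    return [(m.start(), m.group()) for m in pattern.finditer(tail.upper())]


# Tail fragments quoted in the text; the labels are shorthand only.
tails = {
    "huCD1d": "SYQGVL",
    "murine CD1": "SAYQDIR",
    "crocodilian CD1.1": "YQDI",
    "AncaCD1": "YEDV",
}

for name, tail in tails.items():
    print(name, find_motif(tail, STRICT), find_motif(tail, RELAXED))
```

With these inputs, the strict pattern matches the human, murine, and crocodilian fragments, whereas the green anole lizard fragment is reported only by the relaxed pattern.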
According to the crystal structure analysis of chCD1.2, 23
groove-forming residues were found to be identical or similar to
those of human CD1 (Zajonc et al. 2008). In the reptilian CD1
proteins, these residues were also identical or similar to those of
human or chicken CD1 (Fig. 5). We performed structural modeling of
the reptilian CD1 proteins using SWISS-MODEL. The results suggest
that all of the analyzed reptilian CD1 proteins have dual pockets
(A’ and F’), similar to chCD1.1 and huCD1d (Fig. 6).


Fig. 6 The structural modeling of the partial reptilian CD1. The PDB
files used in analysis are 1ZT4 (human CD1d), 3JVG (chicken CD1.1),
and 3DBX (chicken CD1.2). The structural modeling of the reptilian
sequences was performed using SWISS-MODEL
(http://swissmodel.expasy.org/). The cartoon representation was
prepared using PyMOL.


The α1 and α2 helices are colored in cyan; the β-sheets are colored in
red. The two pockets are indicated by A’ and F’. The A’ loop is marked in
chCD1.2. Phe, which has a large benzene ring side-chain that can block
the entrance of the F’ pocket and lead to a missing F’ pocket, is shown in
chCD1.2 (b)


The genomic locations of the reptilian CD1 genes


The genomic locations of chicken CD1 in the MHC region and human
CD1 in an MHC paralogous region have previously been reported
(Calabi and Milstein 1986; Dascher and Brenner 2003; Miller et al.
2005; Salomonsen et al. 2005). The question then becomes whether
the reptilian CD1 genes are more similar to those of chickens or
humans in terms of their genomic locations. Based on the NCBI
genomic database, we found that, similar to the situation in
chickens, the green anole lizard AncaCD1 and Chinese alligator
AlsiCD1.1 genes are located in the MHC locus, whereas the third
Chinese alligator CD1 gene (AlsiCD1.3, partial sequence) is located
in a distinct MHC paralogous region (Fig. 7, supplemental table 2).
In detail, AlsiCD1.1 is located in scaffold 634_1 (GenBank accession
No: NW_005842558.1), in which the MHC locus is also located. Many
CD1-like sequences are observed flanking AlsiCD1.1, some of which
show more than 80 % amino acid sequence identity with AlsiCD1.2,
which is consistent with the results of the Southern blotting.
AlsiCD1.2 is found in scaffold 1413_1 (GenBank accession No:
NW_005843837.1), and there are no other genes predicted in this
scaffold. The third Chinese alligator CD1 gene (AlsiCD1.3, GenBank
accession No: XP_006036211) is found in scaffold 113_1 (GenBank
accession No: NW_005842918.1), which contains genes similar to those
located in the MHC paralogous region on human chromosome 19 (Wan
et al. 2013).


[Fig. 7 diagram labels: green anole lizard NC_014777.1; Chinese
alligator NW_005842558.1; Chinese alligator NW_005842918.1; genes shown
include fish-egg lectin-like, CCDC105, GABBR1, NFKBIL1, MHC I, ZBTB22,
MHC II, PHF1, and ARHGAP33; marked intervals ~4.2 Mb, ~1.3 Mb, ~3.7 Mb]

Fig. 7 A schematic diagram showing some annotated genes flanking the
CD1 genes described in this study. The exact locations of these genes
are shown in Supplemental table 2. Arrows show the transcriptional
orientation, and the regions not included are represented as “||”.
a Green anole lizard chromosome 2. b Chinese alligator scaffold 634_1
and scaffold 113_1

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An analysis of the green anole lizard genome database 
showed that AncaCD1 is located on chromosome 2 (GenBank 
accession No: NC_014777.1). Although the lizard MHC locus is located
on the same chromosome, the distance between the CD1 and MHC I genes
is approximately 10 Mb. Many genes between CD1 and MHC I, such as
GABBR1 and AGPAT, belong to the MHC I, MHC II, or MHC III regions.
Interestingly, other genes closely flanking the CD1 gene, such as
RNF223, RPS5, CCDC105, and ZNF850, are the same as those located in
the MHC paralogous regions on human chromosomes 1 and 19 (three genes
on chromosome 19 and one on chromosome 1). This shows that the lizard
CD1 gene is more tightly linked to the MHC paralogous region, although
it is located on the same chromosome as the MHC genes.
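
The synteny argument above amounts to tallying the genes flanking the lizard CD1 locus by the human chromosome on which their counterparts lie. A minimal Python sketch of that tally follows. The per-gene chromosome assignments in the dictionary are assumptions chosen to be consistent with the stated totals (three on chromosome 19, one on chromosome 1) and should be checked against current human gene annotation; the function name tally_by_chromosome is hypothetical.

```python
from collections import Counter

# Assumed human chromosome for each flanking gene. The text gives only the
# totals (three on chromosome 19, one on chromosome 1), so verify each entry
# against a current annotation before relying on it.
human_chromosome = {
    "RNF223": "1",
    "RPS5": "19",
    "CCDC105": "19",
    "ZNF850": "19",
}


def tally_by_chromosome(genes, mapping):
    """Count how many of the given genes map to each human chromosome."""
    return Counter(mapping[g] for g in genes if g in mapping)


flanking = ["RNF223", "RPS5", "CCDC105", "ZNF850"]
counts = tally_by_chromosome(flanking, human_chromosome)
print(dict(counts))
```

Genes absent from the mapping are skipped rather than guessed, so the tally reflects only the assignments that were supplied.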


Discussion 


In the present study, we have identified CD1 genes in three
reptiles, namely the green anole lizard, Chinese alligator, and
Siamese crocodile, demonstrating that CD1 is ubiquitous in
reptiles, birds, and mammals. However, CD1 genes differ
considerably in gene number, isotypes, and genomic locations
among these species, and these differences may in turn allow us to
trace the evolutionary history of the CD1 genes.

The CD1 genes of Crocodylia can be divided into two isotypes,
whereas the CD1 genes of chickens and mammals are divided into
two and five isotypes, respectively. As with the chicken CD1
isotypes, neither of the two crocodilian CD1 isotypes is
orthologous to any of the five mammalian CD1 isotypes, as revealed
by phylogenetic analysis. The same applies to the CD1 isotypes of
crocodilians and chickens. This strongly suggests an independent
diversification of CD1 isotypes during the speciation of
mammals, birds, and reptiles.

The finding that CD1 is ubiquitous in reptiles, birds, and
mammals suggests that CD1 emerged in an ancestral species
common to reptiles, birds, and mammals. Given that the CD1 gene
originated from an MHC I gene (Dascher 2007; Kasahara 1999;
Martin et al. 1986; Maruoka et al. 2005; Miller et al. 2005;
Salomonsen et al. 2005), the major issue that remains unclear is
when and how this genetic event occurred. As described in the
“Introduction”, this is the most intriguing but also a quite
controversial issue in comparative CD1 studies to date. Three main
models have been proposed to address this puzzling issue, and all
three are based on the finding of four MHC paralogous regions in
many vertebrates and on the 2R hypothesis of vertebrate evolution
(Hokamp et al. 2003; Ohno 1970). Briefly, the first model assumes
that CD1 was duplicated from MHC I in the primordial MHC, that both
were then distributed to different paralogous regions during 2R,
and that additional genetic events
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further deleted or silenced CD1 or MHC I in different 
paralogous regions in different lineages (Salomonsen et al.
2005). The second model hypothesizes that only MHC I was
originally distributed to the four paralogous MHC regions during
2R, followed by evolution of the class I gene into CD1 in one
paralogous region and retention or silencing of the class I genes
in the other paralogous regions (Kasahara 1999). The third
model differs from the second by assuming that the evolution of
MHC I into CD1 in one paralogous region occurred not in the early
stage of vertebrate evolution but later, in a close ancestor of
birds and mammals (Miller et al. 2005).

Some evidence from this study favors the first model. The CD1
genes in Crocodylia are located in two loci, linked to the MHC
region and to an MHC paralogous region (corresponding to the MHC
paralogous region on human chromosome 19), respectively. In the
green anole lizard, the CD1 gene is also more tightly linked to the
same MHC paralogous region, although it is located on the same
chromosome as the MHC genes. Considering that the chicken CD1 is
linked to the MHC and human CD1 is located in the MHC paralogous
region on chromosome 1 (Calabi and Milstein 1986; Dascher and
Brenner 2003; Miller et al. 2005; Salomonsen et al. 2005), CD1 can
indeed be found in three of the four MHC paralogous regions, albeit
in different species. This seems to imply that CD1 genes emerged
together with the birth of the four MHC paralogous regions and that
different species may have retained CD1 genes differentially in
these regions. However, this model can hardly explain why CD1 genes
are not found in amphibians and fish.

The pivotal difference between models 2 and 3 is when the CD1 gene
evolved from MHC I during vertebrate evolution. The data now
available favor model 3 over model 2, since CD1 has not been
identified in amphibians or fish. Taking all the available
information together, a slightly modified hypothesis based on
model 3 seems more reasonable: class I genes in the primordial MHC
were distributed to different MHC paralogous regions during 2R,
followed by retention of class I genes in one paralogous region and
silencing of the class I genes in the other three paralogous
regions. In the MHC region of a close ancestor of birds, reptiles,
and mammals, MHC I was then duplicated to generate CD1 by
neofunctionalization. This could explain why the CD1 gene is linked
to the MHC region in several species, including chickens and
crocodilians as well as the green anole lizard. Subsequent
chromosomal translocations may account for the distribution of CD1
in other MHC paralogous regions in different species. Some clues to
putative translocations can be derived from the green anole lizard,
as in this species CD1 is not only located on the same chromosome
as the MHC genes but is also tightly associated with genes located
in the MHC paralogous regions on human chromosomes 19 and 1. A
similar hypothesis was previously proposed by Dascher (2007), but
that author assumes that MHC I did not emerge in all four MHC
paralogous regions
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originally but appeared in one paralogous region after emer- 
gence of jawed vertebrates. 

In summary, we have identified CD1 genes in several reptiles and
deduced their isotypes and genomic locations. These data help in
understanding the origin of CD1 genes in the context of MHC I gene
evolution. Further analysis of the gene content of MHC paralogous
regions in more jawed vertebrates, such as amphibians, teleosts,
and cartilaginous fish, is expected to provide more clues to this
puzzling issue.
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